Autonomous systems require the ability to plan effective courses of action under potentially uncertain or
unpredictable contingencies. Effective planning requires knowledge of the environment, and if the environment is
too complex or changes dynamically, goal-driven learning with reactive feedback becomes a necessity. This paper
addresses the issue of learning by experimentation as an integral component of PRODIGY, a flexible planning system
augmented with capabilities for execution monitoring and dynamic replanning upon receiving adverse feedback.
PRODIGY encodes its domain knowledge as declarative operators, and applies the operator refinement method to
acquire additional preconditions or postconditions for its operators when observed consequences diverge from
internal expectations. When multiple explanations for the observed divergence are consistent with the existing
domain knowledge, experiments to discriminate among these explanations are generated. Thus, experimentation is
demand-driven and exploits both the internal state of the planner and any external feedback received. A detailed
example of integrated experiment formulation is presented as the basis for a systematic approach to extending an
incomplete domain theory or correcting a potentially inaccurate one.
1. Introduction: The Need for Reactive Experimentation

Learning in the context of problem solving can occur in multiple ways, ranging from macro-operator formation
(Fikes, 1971, Minton, 1985, Cheng & Carbonell, 1986) and generalized chunking (Laird et al., 1986), to analogical
transfer of problem solving strategies (Carbonell, 1983, Carbonell, 1986) and pure analytical or explanation-driven
techniques (Mitchell et al., 1986, DeJong & Mooney, 1986, Minton & Carbonell, 1987). All of these
techniques, however, focus on the acquisition of control knowledge to solve problems faster, more effectively, and
to avoid pitfalls encountered in similar situations. Newly acquired control knowledge may be encoded as preferred
operator sequences (chunks and macro-operators), improved heuristic left-hand sides on problem solving operators
(as in LEX (Mitchell et al., 1983)), or explicit search-control rules (as in PRODIGY (Minton et al., 1987)).

However important the acquisition of search control knowledge may be, the problem of acquiring factual domain
knowledge and representing it effectively for problem solving is of at least equal significance. Most systems that
acquire new factual knowledge do so by some form of inductive generalization2, but operate independently of a
goal-driven problem solver, and have no means of proactive interaction with an external environment (with the
exception of some learning work in robotics and the world modelers project (Carbonell & Hood, 1986)). When one
observes real-world learners, ranging from children at play to scientists at work, it appears that active
experimentation plays a crucial role in formulating and extending domain theories, whether everyday "naive" ones,
or formal scientific ones. Many actions are taken in order to gather information and learn whether or not predicted
results come to pass, or unforeseen consequences occur. Of course, experimentation can yield search control
preferences, as well as factual knowledge, as we see in our later example. The focus of this chapter is on experiment
formulation and analytical interpretation in the context of PRODIGY (Minton & Carbonell, 1987, Minton et al., 1987),
an interactive planning system, rather than on empirical interpretation of results from pre-formulated experiments in
a single-pass learning-by-discovery approach typical of systems such as BACON (Langley et al., 1983) and ABACUS
(Falkenhainer & Michalski, 1986).

In order to endow a problem solver with the capability to experiment on the external world, we start by
interleaving planning and execution monitoring, so that external feedback is immediate. If the plan does not unfold
as expected (e.g., unforeseen interactions take place, actions have unexpected consequences, etc.) the system replans
dynamically using better-known methods, or suspends planning in order to determine the source of the discrepancy.
Here is where experimentation is triggered: divergence from expected results that interferes with carrying out a plan
for the active goal. The objective of the experiment is to augment the domain theory (e.g., record previously
unknown consequences, after determining what conditions are needed to bring them about), or to correct that
domain theory (e.g., deleting or altering the expected effects or applicability conditions of operators, in order to
force the internal model to accord with external reality). Experimentation is used to isolate the cause of each
discrepancy, and make the minimal modification possible to the internal model in order to establish external
consistency. Moreover, this metaprinciple of "cognitive inertia" dictates that monotonic changes (adding new
information) be preferred over nonmonotonic ones (changing previous information) if both are of equivalent scope.
2. Other Research in Learning by Experimentation

Machine Learning has not yet addressed centrally the topic of learning by active experimentation, although there
has been related work in scientific discovery and more recently some attempts to address directly the issue of
experimentation.

The BACON and GLAUBER systems (Langley et al., 1986) are able to discover qualitative or quantitative empirical
laws, focusing on the empirical interpretation of results from pre-formulated experiments. The authors have
proposed combining these systems, having GLAUBER provide BACON with some qualitative information about the
data; BACON would then be able to acquire data on its own by formulating experiments. FAHRENHEIT (Koehn &
Zytkow, 1986) designs limited experiments in terms of quantitative values of the experiment's parameters to
determine the scope of a law given by BACON.

Lenat's AM and EURISKO systems (Lenat, 1983) can be said to experiment, but in a limited sense. Both utilize
heuristics that change internal concepts which are then tested for "interestingness", but not necessarily for external
validity. In its symbiotic mode, however, EURISKO received feedback from the user (Doug Lenat), and was closer to
a full experimentation system.

In LEX, Mitchell uses a limited form of experimentation to generate problems in symbolic integration that
formulate desirability conditions for when to select problem solving operators (Mitchell et al., 1983). His primary
experiment generation method is to compose a problem that would maximally reduce the version space of possible
desirable application conditions for the operator in question.

In some preliminary work with the IDS system, Langley and Nordhausen (Langley & Nordhausen, 1986) investigate
experimentation in a qualitative physics framework. Also in initial stages of investigation, Kulkarni and Simon
(Kulkarni & Simon, 1987) are developing general and domain-dependent heuristics for scientific experimentation,
and Shen (Shen, 1987) is developing similar methods for naive experimentation.

The ADEPT system (Rajamoney, 1986) is concerned with experimentation in domains with incomplete or
inconsistent theories. The domain knowledge is expressed in terms of qualitative physics. When a contradiction
arises in the process of explaining an observation, the system uses a set of beliefs to propose some hypotheses.
Several kinds of experiments are proposed to test these hypotheses. The design of an experiment is made following
an algorithm that depends on the type of experiment, and that algorithm determines the necessary pieces of
information associated with the experiment to be performed. Ultimately the system would design experiments that
allow the construction of explanations [illegible] with incorrect domain theories.
3. Background: The Role of Experimentation in PRODIGY

The PRODIGY system (Minton et al., 1987, Minton & Carbonell, 1987) is a general-purpose planner at CMU that
serves as the underlying basis for much machine-learning research. In essence, PRODIGY learns incrementally
through experience in solving increasingly more complex problems in a task domain, and gradually transitions from
naive student, to apprentice, to journeyman, and eventually (we hope) to domain expert. Thus far we have
experimented successfully with a version of explanation-based learning (EBL) (Mitchell et al., 1986) that can learn
from failed instances (to avoid future failures that share the same underlying cause) and goal interactions, as well as
the standard EBL, based on deductively provable generalization from positive instances. We are also studying the
role of case-based learning in PRODIGY, and are exploring interactive knowledge acquisition from a domain expert
who looks over the proverbial shoulder of the planning system, making concrete suggestions on the current plan
being synthesized, and occasionally providing more general advice.

Whereas experimentation in its broadest sense can be a very powerful and general learning method, here we
confine our study to a very concrete type of experimentation: operator refinement. In essence, we assume that the
domain knowledge is encoded as a set of declarative operators and inference rules.3 Presently, learning is confined
to the acquisition of new pre- and postconditions for existing operators, which start as approximations of external
reality and are refined to match that reality whenever discrepancies occur between internal expectations and external
observation. Later we hope to extend the method to the acquisition of new domain operators.
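The operator-refinement idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than PRODIGY's actual representation: the `Operator` class, its method names, and the encoding of a literal as a (predicate, truth value) pair are all hypothetical, and the ALUMINIZE operator is borrowed from the telescope-mirror example used later in the paper.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple

# A literal is a (predicate, wanted) pair, e.g. ("IS-CLEAN", True).
# This encoding is an assumption of the sketch, not PRODIGY's syntax.
Literal = Tuple[str, bool]


@dataclass
class Operator:
    """Declarative operator: a name, preconditions, and postconditions."""
    name: str
    preconditions: Set[Literal] = field(default_factory=set)
    postconditions: Set[Literal] = field(default_factory=set)

    def applicable(self, state: Set[str]) -> bool:
        # A positive literal must be in the state; a negated one must not be.
        return all((pred in state) == wanted
                   for pred, wanted in self.preconditions)

    def predict(self, state: Set[str]) -> Set[str]:
        # Expected successor state according to the current (possibly
        # incomplete) model of the operator.
        result = set(state)
        for pred, wanted in self.postconditions:
            if wanted:
                result.add(pred)
            else:
                result.discard(pred)
        return result

    def refine(self, precondition: Optional[Literal] = None,
               postcondition: Optional[Literal] = None) -> None:
        # Monotonic refinement: knowledge is only added, never removed.
        if precondition is not None:
            self.preconditions.add(precondition)
        if postcondition is not None:
            self.postconditions.add(postcondition)


aluminize = Operator(
    "ALUMINIZE",
    preconditions={("IS-CLEAN", True), ("IS-SOLID", True)},
    postconditions={("IS-REFLECTIVE", True)},
)
before = {"IS-SOLID", "IS-CLEAN"}
expected = aluminize.predict(before)

# After observing that the object is no longer clean, record the new effect.
aluminize.refine(postcondition=("IS-CLEAN", False))
revised = aluminize.predict(before)
```

In this sketch the refinement step only ever adds literals, which mirrors the stated preference for monotonic changes over nonmonotonic ones.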

Experimentation  may  be  targeted  at  the  acquisition  of  different  kinds  of  knowledge,  though  augmentation  of  an 
incomplete  domain  theory  (via  refinement  of  operators)  is  our  current  focus  of  attention: 

• Experimentation to acquire and refine control knowledge. When multiple sequences of actions appear
to achieve the same goal, experimentation and analysis are required to determine which plan is the most
cost-effective or robust one, and to generalize and compile the appropriate conditions so as to formulate
the preferred plan in future problem solving instances where the same goal and relevant initial
conditions are present. Thus, experimentation may be guided towards producing far more effective use
of existing domain knowledge.

• Experimentation to augment an incomplete domain theory. Experiments may be formulated to
synthesize new operators, learn new consequences of existing operators or determine previously
unknown interactions among existing operators. Also, performing known actions on new objects in the
task domain in a systematic manner, and observing their consequences, serves to acquire properties of
these new objects and classify them according to pragmatic criteria determined by the task domain.
Thus, experimentation may be guided towards acquiring new domain knowledge from the external
environment.

• Experimentation to refine an incorrect domain theory. No comprehensive theory is ever perfect, as
the history of science informs us, whether it be Newton's laws of motion or more ill-structured domain
theories embedded in the knowledge bases of expert systems. However, partially correct theories often
prove useful, and are gradually improved to match external reality (and are occasionally totally replaced
by a newer conceptual structure). Here we deal only with minor errors of commission in the domain
theory, which when locally corrected improve global performance. We believe automated knowledge
refinement is a very important aspect of autonomous [illegible]. Thus, experimentation may be
guided at incremental correction of a domain theory.4

Our central concern is the development of a method to generate operational hypotheses (those that can be tested
through an external experiment) to account for unexpected divergence between predicted and observed
consequences. Experimentation is invoked when such a divergence prevents the formulation of a plan to solve the
problem at hand; thus "idle curiosity" is not our target. Moreover, the entire planning context is used to formulate
and guide the experiment, in order to focus on the most direct and economical way of inferring the missing
knowledge. Concessions must be made to other protected goals in the course of the experimentation: assuring safety
of the experimenter, not consuming a resource in the experiment that will be required to carry out the rest of the
plan, etc. Thus, experiment formulation, once invoked with the appropriate constraints, becomes itself a
meta-problem amenable to all the methods in the general purpose planner. The EBL method (or perhaps a
similarity-based method, SBL) may then be invoked to retain not just the result of the instance experiment, but its
provably correct generalization (or empirically appropriate one if SBL is used).


4. The Base-Level System: Knowledge Required for Planning

Consider an example domain of expertise: crafting a primary telescope mirror from raw materials (such as pyrex
glass, pure aluminum, distilled water, etc.) and pertinent tools (such as grinding equipment, aluminum vaporizers,5
etc.). The operators in the domain include: GRIND-CONCAVE, POLISH, ALUMINIZE, and CLEAN. A complete
domain theory would include, in addition to these four operators themselves, knowledge of:

• all the relevant preconditions for each operation to proceed successfully,

• all the consequences of applying each operator (stated as changes to the global world state),

• and all the objects to which these operators may be applied to achieve the desired effects (for instance,
wood may be ground into a concave shape, but the result would not be an optical-quality telescope
mirror).


In addition to the domain theory, an optimal-performance system needs to know control rules (hard and fast ones,
as well as heuristic ones). These rules perform the following tasks:

• When multiple goals are present, determine which goals to work on first or which ones to work on at
all. For instance, if the goals is-polished and is-ground-concave are both present, it is better to work on
the latter first so as not to undo polishing by later grinding. Similarly, if the goals of reduce-weight of
the glass and is-ground-concave are both present, it may prove unnecessary to do more than grind, as
that reduces weight as a side effect of grinding away some of the glass in the process of making it
concave. Such interactions have been investigated before, albeit not in a very systematic manner
(Sacerdoti, 1977, Carbonell, 1981, Wilensky, 1981). Here we are focusing on an integrated architecture
to acquire knowledge of plan interactions through observation of the consequences of its actions in the
external environment, and when necessary through focused experimentation.


• When multiple operators may be chosen in order to make progress toward the goal, determine
which one(s) to apply. This is the standard role of a heuristic evaluation function;
we propose to do the selection by compiling explicit symbolic reasoning, rather than a-priori numerical
metrics. The notion of learning operator preferences in the context of an active goal was the central task
of LEX (Mitchell et al., 1983), and is one of the major effects of chunking and universal subgoaling in
SOAR (Laird et al., 1986). At one end of the spectrum one can view a string of purely deterministic
preferences as equivalent to a linear macro-operator (Fikes, 1971, Minton, 1985, Cheng & Carbonell,
1986), and at the other extreme as guiding search in preferential directions based on past experience.

•  When  multiple  objects  may  be  chosen  on  which  to  apply  the  operators,  determine  which  one(s)  to 
select.  Again,  these  can  be  categorical  (polishing  and  aluminizing  the  wrong  surface  of  a  mirror  will 
never  yield  desired  results)  or  preferential  (choosing  a  fast  rough-grinding  tool,  vs  choosing  a  slow 
fine-grinding  one,  vs  choosing  both  -  the  former  for  rough  shaping,  followed  by  the  latter  for  fine 
adjustment).  Preferences  may  be  stated  in  terms  of  achieving  higher  quality  plans  (more  efficient  ones 
to  execute,  or  ones  more  likely  to  succeed),  or  in  terms  of  minimizing  planning  effort  (producing  a 
working  solution  quickly,  even  if  it  may  be  far  from  an  optimal  plan). 

These decision points serve a dual role in PRODIGY: learning control rules to make the right decisions (Minton et al.,
1987), and providing the handle for the experimentation module to direct the problem solver when it must perform
actions to seek new knowledge before returning to the problem at hand.


5.  Types  of  Knowledge  Acquired 

A domain theory of the world can be incomplete in several different senses:

•  Factual  properties  of  objects  in  the  world  could  be  missing  (size,  color,  category,  functional  properties, 
etc.) 

• Entire operators could be missing: the planner may not know all its capabilities.

• Operators could be partially specified: the planner may know only some of their preconditions and
some of their consequences.

• Interactions among operators could be unknown, causing planning failures or planning inefficiencies.

Thus far we have worked on operator refinement addressing only the latter two categories of missing knowledge.
Learning control knowledge to cope with certain kinds of operator interactions in PRODIGY is discussed in (Minton
et al., 1987), and illustrated in our detailed example. The methods for acquiring the missing pre- and postconditions
of operators are summarized in the table below, and elaborated in the detailed example that follows. In essence,
plan execution failures trigger the experimentation and replanning process. Thus, each method is indexed by the
failure condition to which it applies, encoded as differences between expected and observed outcomes.
EXPECTED        OBSERVED         RECOVERY        LEARNING METHOD
OUTCOME         BEHAVIOR         STRATEGY        (EXPERIMENT GENERATOR)

all the known   at least one     plan to         binary search on operator
preconditions   precondition     achieve         sequence from establishment
satisfied       is violated      the missing     of precondition to present,
earlier         at present       precondition    adding negated precondition
                                                 as postcondition of the
                                                 culprit operator

all the known   all the known    attempt to      compare present failure
preconditions   preconditions    plan without    to the last time operator
satisfied       satisfied        this operator,  applied successfully,
earlier         but operator     or failing      generating in a binary
                fails to apply;  that, suspend   search intermediate world
                postconditions   plan till the   descriptions to identify
                remain undone    experiment is   the necessary part of the
                                 complete        state, adding it to the
                                                 operator preconditions

operator        at least one     if the unmet    compare to last time all
applies and     postcondition    postcondition   postconditions were met,
all the         fails to be      is incidental   perform binary search on
postconditions  satisfied        ignore it,      world state to determine
are satisfied                    but if it is    necessary part to achieve
                                 a goal state    all postconditions; then
                                 try different   replace operator with two
                                 operator(s)     new ones: one with the new
                                                 precondition and all the
                                                 postconditions, the other
                                                 with the new precondition
                                                 negated and without the
                                                 postcondition in question
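The indexing of learning methods by failure condition can be sketched as a small classifier over one monitored operator application. This is an illustrative sketch, not the paper's implementation: the `Failure` names and the `classify` function are hypothetical, literals are assumed to be (predicate, truth value) pairs, and the sketch assumes that "operator fails to apply" can be recognized by none of its postconditions having come about.

```python
from enum import Enum


class Failure(Enum):
    PRECONDITION_VIOLATED = "a known precondition no longer holds"
    OPERATOR_FAILED = "preconditions hold but the operator did not apply"
    POSTCONDITION_UNMET = "operator applied but a postcondition is missing"
    NONE = "no divergence"


def holds(literals, state):
    """True if every (predicate, wanted) literal agrees with the state."""
    return all((pred in state) == wanted for pred, wanted in literals)


def classify(preconds, postconds, observed_before, observed_after):
    """Map one monitored operator application onto a failure condition."""
    if not holds(preconds, observed_before):
        return Failure.PRECONDITION_VIOLATED
    unmet = [lit for lit in postconds if not holds([lit], observed_after)]
    if postconds and len(unmet) == len(postconds):
        # Assumption of this sketch: if nothing the operator promised
        # happened, treat it as having failed to apply.
        return Failure.OPERATOR_FAILED
    if unmet:
        return Failure.POSTCONDITION_UNMET
    return Failure.NONE


polish_pre = {("IS-GLASS", True), ("IS-CLEAN", True)}
polish_post = {("IS-POLISHED", True)}
case = classify(polish_pre, polish_post,
                {"IS-GLASS", "IS-CLEAN"}, {"IS-GLASS", "IS-CLEAN"})
```

Each returned value would then select the corresponding recovery strategy and experiment generator.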


6. Learning by Experimentation: A Detailed Example

Let us return to our telescope mirror example, and assume that we have only a partial domain theory and virtually
no control knowledge. How can PRODIGY, through its [illegible] to solve the problem, learn to [illegible] the next
time? Can learning be improved by [illegible] for the sake of acquiring knowledge, in addition to
pursuing externally given tasks? Suppose we start with the following (greatly simplified) knowledge:
OPERATORS                 PRECONDITIONS          CONSEQUENCES

1) GRIND-CONCAVE(<obj>)   ISA(<obj>, solid)      IS-CONCAVE(<obj>)

2) POLISH(<obj>)          ISA(<obj>, glass)      IS-POLISHED(<obj>)
                          IS-CLEAN(<obj>)

3) ALUMINIZE(<obj>)       IS-CLEAN(<obj>)        IS-REFLECTIVE(<obj>)
                          ISA(<obj>, solid)

4) CLEAN(<obj>)           ISA(<obj>, solid)      IS-CLEAN(<obj>)

INFERENCE RULES:

1) IS-REFLECTIVE(<obj>) & IS-POLISHED(<obj>) --> IS-MIRROR(<obj>)

2) IS-MIRROR(<obj>) & IS-CONCAVE(<obj>) --> IS-TELESCOPE-MIRROR(<obj>)

Given the operators and inference rules above, let us suppose that the goal of producing a telescope mirror arises,
and we have glass blanks and wood pieces to work with, none of them with clean or polished surfaces. PRODIGY
starts backchaining by matching the goal state against the right hand side of operators and inference rules,
concluding that in order to make a telescope mirror it should first make a mirror, and then make its shape concave.
Then seeing how to make a mirror, it concludes that it should make it reflective and then polish it (by matching
IS-MIRROR against the right hand side of the second inference rule). Let us assume for now that PRODIGY correctly
selected the glass blank (it was listed first) as the starting object. Now it must apply the operator ALUMINIZE to
the glass, which requires that it be a solid (see figure 6-1 for the object hierarchy), and that it be clean. The first
precondition is satisfied (glass is a solid), and the second one requires applying the CLEAN operator, which
succeeds because any solid thing may be cleaned. These successes enable the ALUMINIZE operator to apply
successfully, and go on to the next goal in the conjunctive subgoal set: IS-POLISHED (see figure 6-2). Thus far
there have been no surprises and no learning, just locally successful performance.
However, whereas PRODIGY believed that the POLISH operator preconditions were satisfied (it believes in
temporal persistence of states, such as IS-CLEAN, unless it learns otherwise), the environment states the contrary:
the glass is not clean. The first learning step occurs in the attribution of this state change to one of the actions that
occurred since the state IS-CLEAN was brought about. Since there was only one intervening operator invocation
(ALUMINIZE), it infers that a previously unknown consequence of this operator is ~IS-CLEAN (meaning retracting
IS-CLEAN from the current state). If there had been many intermediate operators, specific experiments to perform
some but not other steps would have been required to isolate the culprit operator. After applying the CLEAN
operator once more, it again attempts to POLISH, but the operator does not result in the expected state
IS-POLISHED. This means that either it is missing some knowledge (some other precondition for POLISH is
required), or its existing knowledge is incorrect (IS-POLISHED is not a consequence of POLISH). Always
preferring to believe its knowledge correct unless forced otherwise, it prefers to examine the former alternative.
But how can it determine what precondition could be missing?
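For the case where many operators intervene between establishing a condition and finding it violated, the culprit can be isolated by a binary search over prefixes of the operator sequence. The sketch below is hypothetical: `find_culprit` and the experiment oracle `holds_after_prefix` are names invented here, the operators other than ALUMINIZE are made up for the example, and the search assumes that exactly one operator destroys the condition and that nothing later restores it.

```python
def find_culprit(operators, holds_after_prefix):
    """Return the index of the operator that first destroys a condition.

    operators: operators executed since the condition was established.
    holds_after_prefix(k): hypothetical experiment oracle. It re-establishes
        the condition, executes operators[:k], and reports whether the
        condition still holds. Assumed: it is True for k = 0, False for the
        full sequence, and once the condition is destroyed it stays so.
    """
    lo, hi = 0, len(operators)  # holds after lo steps, violated after hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if holds_after_prefix(mid):
            lo = mid
        else:
            hi = mid
    return hi - 1  # operators[hi - 1] is the culprit


# Hypothetical sequence; only ALUMINIZE comes from the paper's example.
sequence = ["MOVE-TO-BENCH", "ALUMINIZE", "INSPECT", "MOVE-TO-RACK"]


def simulated_experiment(k):
    # Stand-in for acting in the world: IS-CLEAN survives unless
    # ALUMINIZE is among the first k operators.
    return "ALUMINIZE" not in sequence[:k]


culprit = sequence[find_culprit(sequence, simulated_experiment)]
```

With a single intervening operator, as in the text, the loop body never runs and that operator is blamed directly, so no explicit experiment is needed.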


[Figure: goal tree for GOAL: IS-TELESCOPE-MIRROR. Recoverable node labels: IS-MIRROR, IS-CONCAVE,
GRIND-CONCAVE, IS-REFLECTIVE, IS-POLISHED, POLISH, IS-CLEAN; the remaining labels are illegible.]


The only possibilities are un-aluminized dirty glass blanks and dirty wood blanks. Only glass can be polished (see
the precondition table), and all the glass blanks are identical to each other, but different from the current object in
that they are both dirty and unaluminized, so it chooses a glass blank. After cleaning it, the POLISH operator
succeeds, and once again it must establish a reason for the operator succeeding this time, but failing earlier: the only
difference is the glass not being aluminized. Thus a new precondition for POLISH is learned as a result of a simple,
directed experiment: ~IS-REFLECTIVE(<obj>), meaning that once coated with aluminum, the substrate substance
cannot be polished.
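The comparison between the failed and the successful application can be sketched as a set difference over state descriptions, followed by a binary search when more than one difference remains. This is a hedged illustration only: the function names and the `operator_succeeds` oracle are hypothetical, the predicates IS-COOL and IS-WET are invented distractors, and the search assumes a single difference is responsible for the failure.

```python
def candidate_preconditions(success_state, failure_state):
    """Literals distinguishing a state where the operator worked from one
    where it failed, as (predicate, wanted) pairs."""
    only_in_success = success_state - failure_state
    only_in_failure = failure_state - success_state
    return ({(p, True) for p in only_in_success}
            | {(p, False) for p in only_in_failure})


def isolate_precondition(success_state, failure_state, operator_succeeds):
    """Binary search over the differences using intermediate world states.

    operator_succeeds(state) is a hypothetical experiment oracle. The sketch
    assumes exactly one difference is responsible for the failure.
    """
    diffs = sorted(candidate_preconditions(success_state, failure_state))
    if not diffs:
        raise ValueError("states do not differ; nothing to learn")
    while len(diffs) > 1:
        half = diffs[: len(diffs) // 2]
        # Intermediate world: the failure state with only `half` repaired.
        trial = set(failure_state)
        for pred, wanted in half:
            if wanted:
                trial.add(pred)
            else:
                trial.discard(pred)
        diffs = half if operator_succeeds(trial) else diffs[len(diffs) // 2:]
    return diffs[0]


# The case in the text: a single difference, so no search is needed.
failed = {"IS-GLASS", "IS-CLEAN", "IS-REFLECTIVE"}  # aluminized blank
worked = {"IS-GLASS", "IS-CLEAN"}                   # fresh, cleaned blank
learned = candidate_preconditions(worked, failed)

# A harder, invented case with distractor differences.
found = isolate_precondition(
    {"IS-GLASS", "IS-CLEAN", "IS-COOL"},
    {"IS-GLASS", "IS-CLEAN", "IS-REFLECTIVE", "IS-WET"},
    lambda state: "IS-REFLECTIVE" not in state,
)
```

A literal with `False` as its second component corresponds to a negated precondition such as ~IS-REFLECTIVE.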

Now back to the problem at hand. In order to POLISH the glass it must unaluminize it, but there is no known
operator that removes aluminum.6 So the IS-POLISHED subgoal fails, and failure propagates to the IS-MIRROR
subgoal, with the cause of failure being that the IS-REFLECTIVE prevented POLISH from applying. Here there is
a goal interaction7 that can be solved by reordering the interacting components:

If the cause of failure of one conjunctive subgoal is a consequence of an operator in an earlier subgoal in
the same conjunctive set, try reordering the subgoals.

That heuristic succeeds by POLISHing before ALUMINIZing. Having obtained success in one ordering and failure
in another, the system tries to prove to itself that this ordering is always required, and succeeds by constructing the
proof: ALUMINIZE will always produce IS-REFLECTIVE which blocks POLISH, and since there are no other
known ways to achieve IS-POLISHED, failure is guaranteed. The present version of PRODIGY is capable of
producing such proofs in failure-driven EBL mode (Minton & Carbonell, 1987). Thus, a goal-ordering control rule
is acquired for this domain: always choose POLISH before ALUMINIZE, if both are in the same conjunctive goal
set and both apply to the same object.
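The effect of the reordering heuristic can be checked against a small simulated world. The sketch below is not how PRODIGY derives the rule (the paper describes a proof in failure-driven EBL mode); it is a brute-force stand-in that tries each ordering of the conjunctive subgoals. The `WORLD` table encodes the operators with the effects learned so far, and all function names, as well as the IS-SOLID and IS-GLASS predicates standing in for the ISA tests, are assumptions of the sketch.

```python
from itertools import permutations

# Hypothetical "true" world: name -> (preconditions, effects), where each
# literal is a (predicate, wanted) pair.
WORLD = {
    "CLEAN": ([("IS-SOLID", True)], [("IS-CLEAN", True)]),
    "POLISH": ([("IS-GLASS", True), ("IS-CLEAN", True),
                ("IS-REFLECTIVE", False)],
               [("IS-POLISHED", True)]),
    "ALUMINIZE": ([("IS-CLEAN", True), ("IS-SOLID", True)],
                  [("IS-REFLECTIVE", True), ("IS-CLEAN", False)]),
}
# Which operator achieves which subgoal (taken from the operator table).
ACHIEVER = {"IS-POLISHED": "POLISH", "IS-REFLECTIVE": "ALUMINIZE"}


def execute(name, state):
    """Apply an operator in place; return False if it cannot apply."""
    preconds, effects = WORLD[name]
    if not all((pred in state) == wanted for pred, wanted in preconds):
        return False
    for pred, wanted in effects:
        if wanted:
            state.add(pred)
        else:
            state.discard(pred)
    return True


def achieve_in_order(subgoals, initial):
    """Try to achieve the subgoals in the given order, cleaning as needed."""
    state = set(initial)
    for goal in subgoals:
        if "IS-CLEAN" not in state:
            execute("CLEAN", state)
        if not execute(ACHIEVER[goal], state):
            return False
    return all(goal in state for goal in subgoals)


def working_orders(subgoals, initial):
    """All orderings of the conjunctive subgoals that succeed."""
    return [order for order in permutations(subgoals)
            if achieve_in_order(order, initial)]


glass_blank = {"IS-SOLID", "IS-GLASS"}
orders = working_orders(["IS-REFLECTIVE", "IS-POLISHED"], glass_blank)
```

Exhaustive enumeration is only practical for tiny subgoal sets; the proof-based approach in the text generalizes the result without re-running the experiment.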


Now, once again, back to the problem at hand. The system tries again and succeeds in producing a mirror, but
now needs to make it concave. The only operator to make IS-CONCAVE true is GRIND-CONCAVE. Its only
precondition is that the object be solid, and so it applies. At this point the system checks whether it finally has
achieved the top-level goal IS-TELESCOPE-MIRROR, and discovers (much to its dismay, were it capable of
emotions), that all its work on POLISHing and ALUMINIZing has disappeared. The only operator that applied
since the mirror was polished and aluminized is GRIND-CONCAVE, and so it learns two new consequences for
GRIND-CONCAVE: ~IS-POLISHED and ~IS-REFLECTIVE. No explicit experiment was needed as only one [...]
If a result of a subgoal was undone when pursuing a later subgoal in the same conjunctive set, try
reordering these two subgoals.

So, PRODIGY goes off and tries the experiment of achieving IS-CONCAVE before achieving IS-MIRROR, resulting
in a more efficient plan.8 A proof process would again be invoked to determine whether to make it a reordering
rule, concluding that it is always better to achieve IS-CONCAVE first. The chart below summarizes the new
knowledge acquired (in italics) as a result of the problem solving episodes, experiments, and proofs. Such is the
process of fleshing out incomplete domain and control knowledge through experience and focused interaction with
the task environment. Although in the example all the preconditions and consequences learned are negated
predicates, the same process applies to acquiring simple atomic predicates. However, the process of acquiring
logical combinations of atomic predicates is significantly more complex.


OPERATORS  "PRECONDITIONS  CONSEQUENCES 


X)  GRIND -CONCAVE  «ob  j» 

ISA (<ob j>,  solid) 

I S -CONCAVE ( <ob i » 
~lS-POUSHED(<obj> ,) 
-IS-REFLF.  C.rr/Ef  <obj> ) 

2)  POLISH (<obj>) 

ISA (Cob j>,  gl.sias) 
IS-CLEAN (<obj>) 

~iS-REFLECTIVE(  <obj» 

IS -POLISHED (Cob j^) 

3)  ALUMINIZE (<obj>) 

IS -CLEAN (<obj>) 

ISA  (Cob  j>,  solid) 

IS -REFLECTIVE  (<ob  j» 
~tSCLEAN(  <obj> ) 

4)  CLEAN (<ob j>) 

ISA (Cob j>,  solid) 

IS -CLEAN (Cob j>) 

INFERENCES: 

I )  Id -REFLECTIVE (<obj>) 

&  IS -POLISHED (<obj>)  -• 

->  IS  -MIRROR (Cob j>) 

2)  IS-  MIRROR ( Cob j  > )  &  I 

S -CONCAVE  <<obj>)  -~>  IS- 

■TELESCOPE -MIRROR ( Cob j>) 

NEWLY  ACQUIRED  CONTROI 

!  RULES  (tit  SUBGOAL  OR  I)  El 

O'NG: 

.1 5  Seiner  IS-PQl  lSiit'l>(<  chj>)  before  IS-REEt .ECTP2f.'(  <obj> )  if  both  are 
present  tn  the  some  conjunctive  nibyoal  set 

2)  Select  IS-CONCAVEt  <ob/>)  before  fS-Mflift  "■«(  <ohf>)  if  both  are 
present  tn  the  same  conjunctive  sub  goal  set. 
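
The operator chart above can be read as a small data structure. The following Python sketch is only an illustration under stated assumptions: the encoding of a state as the set of predicates true of a single object, the representation of a negated predicate as a (name, False) pair, and all function names are hypothetical and are not taken from the PRODIGY implementation. It shows why achieving IS-CONCAVE first yields a telescope mirror, while grinding last undoes the mirror.

```python
from dataclasses import dataclass

# Hypothetical encoding: a state is the set of predicates true of one object.
# A literal is (predicate, value); ("is-clean", False) stands for ~IS-CLEAN.


@dataclass(frozen=True)
class Operator:
    name: str
    preconditions: frozenset
    consequences: frozenset


def lits(*pairs):
    return frozenset(pairs)


# The refined operators, including the learned (starred) entries of the chart.
OPERATORS = {
    "grind-concave": Operator(
        "grind-concave",
        lits(("is-solid", True)),
        lits(("is-concave", True), ("is-polished", False), ("is-reflective", False)),
    ),
    "polish": Operator(
        "polish",
        lits(("is-glass", True), ("is-clean", True), ("is-reflective", False)),
        lits(("is-polished", True)),
    ),
    "aluminize": Operator(
        "aluminize",
        lits(("is-clean", True), ("is-solid", True)),
        lits(("is-reflective", True), ("is-clean", False)),
    ),
    "clean": Operator("clean", lits(("is-solid", True)), lits(("is-clean", True))),
}


def holds(state, literal):
    pred, value = literal
    return (pred in state) == value


def apply_operator(op, state):
    """Return the successor state, or raise if a precondition is unmet."""
    if not all(holds(state, l) for l in op.preconditions):
        raise ValueError(f"{op.name} is not applicable")
    new = set(state)
    for pred, value in op.consequences:
        (new.add if value else new.discard)(pred)
    return frozenset(new)


def run(plan, state):
    for name in plan:
        state = apply_operator(OPERATORS[name], state)
    return state


def is_mirror(state):
    return {"is-reflective", "is-polished"} <= state


def is_telescope_mirror(state):
    return is_mirror(state) and "is-concave" in state


initial = frozenset({"is-glass", "is-solid", "is-planar"})
grind_first = run(["grind-concave", "clean", "polish", "aluminize"], initial)
grind_last = run(["clean", "polish", "aluminize", "grind-concave"], initial)
```

Under these assumptions, grind_first satisfies is_telescope_mirror, whereas grind_last is concave but no longer a mirror, which is the interaction the second control rule avoids.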
7. The Current Implementation

The operator refinement strategy has been implemented in a subset of PRODIGY augmented with an execution
monitoring component. We plan to integrate both execution monitoring and experimentation into the full PRODIGY
system shortly. To handle operator and object hierarchies, we are representing operators and other domain
knowledge using Framekit (Carbonell & Joseph, 1986), a frame-based knowledge representation system.

The  planner,  execution  monitor,  and  experiment  proposer  combine  three  sources  of  dynamic  knowledge: 

• The state of the plan being developed and its partial execution.

• PRODIGY's expectations of the current status of the external world.

• The observed status of the external world, including divergences from expectations as determined by
the execution monitor.

Since PRODIGY is not yet connected to an external robot or to the World Modelers simulation environment
(Carbonell & Hood, 1986), execution monitoring proceeds by interrogating the user about aspects of the external
state it deems relevant. These aspects consist of expected changes brought about by the application of operators.
For instance, the system checks that expected consequences of operators have come to pass, but not that all
supposedly persistent states have remained untouched. Problems in the latter category come to light only when a
presumably satisfied precondition to a later operator is found to be violated by the execution monitor. Then, the
experimentation process is invoked to identify which of the candidate intervening actions could be the culprit
operator, augmenting its postconditions so that next time the additional change to the external world is recorded and
expected.
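
The monitoring policy just described can be sketched in a few lines of Python. This is a hedged illustration only: the function name divergences, the dictionary encoding of the internal state, and the use of a lookup in place of questioning the user are all assumptions made for the example. It shows how an unexpected side effect goes unnoticed when only expected consequences are checked, and surfaces later as a violated precondition.

```python
def divergences(predicates, internal_state, observe):
    """Return the predicates whose expected value differs from the observed one.

    internal_state: dict mapping predicate -> bool (the planner's expectation).
    observe: callable predicate -> bool; a stand-in for interrogating the user.
    """
    return [p for p in predicates if internal_state.get(p, False) != observe(p)]


# Hypothetical episode: ALUMINIZE has just been applied, and it silently
# left the glass dirty, which the planner does not yet know.
internal = {"is-clean": True, "is-reflective": True}
world = {"is-clean": False, "is-reflective": True}

# Right after ALUMINIZE only its expected consequence is checked,
# so nothing looks wrong.
after_aluminize = divergences(["is-reflective"], internal, world.get)

# Before POLISH its precondition is checked, and the stale belief is exposed.
before_polish = divergences(["is-clean"], internal, world.get)
```

Here after_aluminize is empty while before_polish contains "is-clean", the point at which experimentation would be triggered.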


8.  The  General  Operator-Refinement  Method 

If the system is given complete and correct knowledge, it uses a standard problem solving approach. In particular,
it employs means-ends analysis to select an operator. Then the system subgoals for every precondition of the
operator that is not matched in the current state. Once all the preconditions are matched, the planner updates the state
with the postconditions of the operator.


With incomplete knowledge, however, the system continually monitors the outside world to check for any
discrepancy with the internal state. When a discrepancy arises, standard problem solving has to be modified as
follows:
THE OPERATOR REFINEMENT METHOD

For every operator O selected

  for every precondition P of operator O
    if State(P) ≠ World(P)
    then One of the operators previously applied
         since P was established has a
         previously unknown postcondition.                      CASE 1

         1) Select candidate operators. The candidate
            operators are all that were applied between
            the last time that P was checked in the World
            and the current check.

         2) Identify responsible operator. Formulate
            experiments by selecting an operator in a
            binary search over the candidate operators,
            applying it, and then checking P in the World. If as a
            result of an experiment with operator O_i, P is
            unexpectedly changed in the World, P is a new
            postcondition of O_i.

         3) Add P as a new postcondition of operator O_i.

  for every postcondition P of operator O
    if State(P) ≠ World(P)
    then

      if ∃ Q precondition of O such that State(Q) ≠ World(Q)
      then One of the operators previously applied
           since Q was established should have had a
           postcondition affecting Q.                           CASE 2

           1) Select candidate operators. The candidate
              operators are all that were applied between
              the last time that Q was checked and the current check.

           2) Identify responsible operators. Formulate
              experiments by selecting an operator in a
              binary search over the candidate operators.
              Each experiment will consist of applying one of the
              operators and then checking Q in the World. If as a
              result of an experiment with operator O_i, Q is
              unexpectedly changed in the World, Q is a new
              postcondition of O_i.

           3) Add Q as a new postcondition of operator O_i.

      if ∀ preconditions Q of O, State(Q) = World(Q)
      then A precondition of operator O might be missing.       CASE 3

           1) Select candidate preconditions. The candidate
              set is formed with all the differences between
              any state in which O was applied successfully
              and the state of the current failed application of O.

           2) Identify responsible precondition [...] over the
              candidate preconditions [...]

           3) Add [...] as the new precondition of O.
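
Steps 1 and 2 of Case 1 can be sketched as follows. This is a minimal Python illustration, not the implemented system: it assumes that exactly one candidate operator undoes P, that P can be re-established before each experiment, and that candidates can be re-applied in isolation. The names select_candidates, find_responsible, establish_p, apply_op and p_holds, as well as the toy world, are hypothetical.

```python
def select_candidates(history, last_checked, now):
    """Operators applied after P was last checked and up to the current check.

    history: list of (time, operator_name) pairs in order of application.
    """
    return [op for (t, op) in history if last_checked < t <= now]


def find_responsible(candidates, establish_p, apply_op, p_holds):
    """Bisect the candidate list, assuming exactly one candidate undoes P.

    Returns the culprit and the number of experiments that were run.
    """
    lo, hi = 0, len(candidates)
    experiments = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        establish_p()                      # make P true again in the world
        for op in candidates[lo:mid]:      # experiment: apply half of the range
            apply_op(op)
        experiments += 1
        if p_holds():
            lo = mid                       # P survived: culprit is in the rest
        else:
            hi = mid                       # P was undone: culprit is in this half
    return candidates[lo], experiments


# Toy world standing in for the external environment (an assumption).
world = set()
HIDDEN_EFFECT = {"grind-concave": "is-polished"}  # unknown to the planner


def establish_p():
    world.add("is-polished")


def apply_op(op):
    world.discard(HIDDEN_EFFECT.get(op))


def p_holds():
    return "is-polished" in world


history = [(1, "polish"), (2, "aluminize"), (3, "grind-concave")]
candidates = select_candidates(history, last_checked=1, now=3)
culprit, n_experiments = find_responsible(candidates, establish_p, apply_op, p_holds)
single = find_responsible(["aluminize"], establish_p, apply_op, p_holds)
```

With a single candidate no experiment is run at all, which matches the situation in the trace where aluminize is the only candidate operator.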
In all of the above cases, the system attempts to recover and fix the plan, using the new information learned. In
addition we use the following heuristics for cases of goal interaction and plan optimization:

If the cause of failure of one conjunctive subgoal is a consequence
of an operator in an earlier subgoal in the same conjunctive set,
try reordering the subgoals.

If a result of a subgoal was undone when pursuing a later
subgoal in the same conjunctive set, try reordering these two subgoals.
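
Once a reordering has been adopted as a goal preference, applying it to a conjunctive subgoal set amounts to an ordering problem. The Python sketch below is one possible reading, under the assumption that preferences are stored as (earlier, later) pairs and are acyclic; the function order_subgoals and this representation are hypothetical. It uses the standard library module graphlib (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def order_subgoals(subgoals, preferences):
    """Order a conjunctive subgoal set so that every applicable preference holds.

    preferences: iterable of (earlier, later) pairs, assumed to be acyclic;
    graphlib raises CycleError otherwise.
    """
    sorter = TopologicalSorter()
    for goal in subgoals:
        sorter.add(goal)
    for earlier, later in preferences:
        if earlier in subgoals and later in subgoals:
            sorter.add(later, earlier)  # `earlier` must precede `later`
    return list(sorter.static_order())


# The two preferences learned in the telescope mirror example.
PREFERENCES = [("is-polished", "is-reflective"), ("is-concave", "is-mirror")]

mirror_goals = order_subgoals(["is-reflective", "is-polished"], PREFERENCES)
telescope_goals = order_subgoals(["is-mirror", "is-concave"], PREFERENCES)
```

A preference is ignored when only one of its two goals is present, mirroring the condition that both be in the same conjunctive subgoal set.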


9.  Concluding  Remarks:  Beyond  Simple  Experimentation 
More comprehensive learning could occur by attempting to generalize the newly acquired preconditions and
consequences to other sibling operators in the operator hierarchy (see figure 9-1). For instance, the newly learned
consequences of destroying a polished or aluminized surface apply not just to GRIND-CONCAVE, but to any
GRIND operation (such as GRIND-CONVEX, GRIND-PLANAR). However, these consequences do not apply to
other RESHAPE operations such as BEND, COMPRESS, etc. The process to determine the appropriate level of
generalization again requires experimentation (or asking focused questions to a human expert). For instance,
observing the consequences of GRIND-PLANAR on a previously aluminized mirror provides evidence that all
GRINDs behave alike with respect to destroying surface attributes, and observing the consequences of bending a
polished reflective glass tube without adverse effects on surface attributes prevents generalization above GRIND.
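
The choice of generalization level can be illustrated with a small Python sketch. It is an assumption-laden reading of the paragraph above, not a description of an implemented component: the PARENT table contains only the operators named in the text, and the rule used here (climb to an ancestor only when another observed descendant confirms the effect and none contradicts it) is a hypothetical simplification.

```python
# Fragment of the operator hierarchy, limited to operators named in the text.
PARENT = {
    "GRIND-CONCAVE": "GRIND",
    "GRIND-CONVEX": "GRIND",
    "GRIND-PLANAR": "GRIND",
    "GRIND": "RESHAPE",
    "BEND": "RESHAPE",
    "COMPRESS": "RESHAPE",
}


def ancestors(op):
    """Chain from op up to the root of the fragment, op included."""
    chain = [op]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def generalization_level(learned_on, observations):
    """Highest node to which the learned consequence may be attached.

    observations: dict operator -> bool, True if the consequence was observed
    for that operator (learned_on itself is expected to map to True).
    """
    level = learned_on
    for node in ancestors(learned_on)[1:]:
        below = {op: seen for op, seen in observations.items()
                 if node in ancestors(op)}
        if not all(below.values()):
            break                 # a counterexample blocks this node and above
        if len(below) > 1:
            level = node          # another descendant confirms the effect
    return level


obs = {"GRIND-CONCAVE": True, "GRIND-PLANAR": True, "BEND": False}
level = generalization_level("GRIND-CONCAVE", obs)
unsupported = generalization_level("GRIND-CONCAVE", {"GRIND-CONCAVE": True})
```

With the observations from the text, the consequence is attached to GRIND and not to RESHAPE; with no further experiments it stays on GRIND-CONCAVE.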


[Figure 9-1: Fragment of operator "isa" hierarchy. Node labels: OPERATOR, OBJECT-PREP, MOVE, SURFACE-PREP,
GRIND, POLISH, PAINT, ALUMINIZE, GRIND-PLANAR, GRIND-CONCAVE, GRIND-CONVEX, CARRY.]

In addition to proposing experiments to guide generalization, we are starting to investigate tradeoffs between




and little if any a-priori control knowledge. The impact of this work should be felt in robotic and other autonomous
planning  domains,  as  well  as  in  expert  systems  that  must  deal  with  a  potentially  changing  environment  of  which 
they  cannot  possibly  have  complete  and  accurate  knowledge  beforehand. 
Appendix: Annotated Program Trace

We include here a trace of our program. The example is the same as in section . The initial state is:

(initial-state ((is-glass glass1)
                (is-solid glass1)
                (is-planar glass1)
                (is-glass glass2)
                (is-solid glass2)
                (is-planar glass2)
                (is-wood wood1)))

and the goal is (is-telescope-mirror glass1).

The trace gives several pieces of information about every operator O:

• When the operator is selected: "Trying operator O".

• When a precondition P is checked in the internal state and external world: "Checking for
precondition P".

• When all the preconditions have been matched: "All preconditions checked, the operator O is being
applied".

• When a postcondition P is being checked in the internal state and in the external world: "Checking
for postcondition P".

Every time a precondition or a postcondition is checked, the results of the checks with the internal state of the
planner and the external world are shown. The system itself finds the information about the internal status, but
the user has to provide the result of the check with the external world.

Only the interesting parts of the trace have been included. It has been commented at some points to make it more
readable.
(goal-state (is-telescope-mirror glass1))

Trying operator (is-telescope-mirror glass1)

Checking for precondition (is-mirror glass1)
Internal State: n  Simulated World: n

Trying operator (is-mirror glass1)

;;; Solving the subgoal (is-reflective)

Trying operator (polish glass1)

Checking for precondition (is-clean glass1)
Internal State: y  Simulated World: n  CASE 1

*** Experimentation triggered
*** New postcondition: (not (is-clean glass1))
*** Candidate operators: ((aluminize))

The postcondition (not (is-clean glass1))
is being added to the operator aluminize

;;; Solving (is-clean glass1)

All preconditions checked,
the operator (polish glass1) is being applied

Checking for postcondition (is-polished glass1)
Internal State: n  Simulated World: y  ;;; CASE 3

Discrepancy  between  state  and  world  (Case  2) 

Checking again for precondition (is-glass glass1)
Internal State: y  Simulated World: y

Checking again for precondition (is-clean glass1)
Internal State: y  Simulated World: y

*** Experimentation triggered

!!,U  !  polish : 

-htfen  i»  iwtivi  •  u>ti  ui  -c-ui  ‘iat 
The precondition (not (is-reflective glass1))
is being added to the operator polish

Retrying operator (polish glass1)

Checking for precondition (is-glass glass1)
Internal State: y  Simulated World: y

Checking for precondition (not (is-reflective glass1))
Internal State: n  Simulated World: n

;;; There is no operator to achieve the goal (not (is-reflective glass1)).
;;; At this point the system hypothesizes that there is a goal interaction
;;; and applies the corresponding heuristic.

New goal preference: Prefer is-polished over is-reflective

;;; Since the system doesn't know how to make glass1 not reflective,
;;; it restarts the process with the glass that looks more like
;;; glass1, which is glass2

;;; The subgoal is-mirror is solved again, but this time considering
;;; the new goal preference rule and using the refined operators.

;;; Then the subgoal is-concave is solved.


All preconditions checked,
the operator (is-telescope-mirror glass2) is being applied

Checking for postcondition (is-telescope-mirror glass2)
Internal State: y  Simulated World: n  CASE 2

Checking again for precondition (is-mirror glass2)
Internal State: y  Simulated World: n

Checking again for precondition (is-concave glass2)
Internal State: y  Simulated World: y

Checking again for precondition (is-reflective glass2)
Internal State: y  Simulated World: n

Checking again for precondition (is-polished glass2)


*** New postcondition: (not (is-polished))
*** Candidate operators: ((aluminize) (grind-concave))

The postcondition (not (is-polished glass2))
is being added to the operator grind-concave

;;; The subgoal is-mirror is solved again,
;;; and the system can finally make a telescope mirror.

Operator (is-telescope-mirror glass2) successfully applied

Success!!

(plan ((clean glass2)
       (polish glass2)
       (aluminize glass2)
       (is-mirror glass2)
       (grind-concave glass2)
       (clean glass2)
       (polish glass2)
       (aluminize glass2)
       (is-mirror glass2)
       (is-telescope-mirror glass2)))

Non optimal plan:

;;; The system applies the plan optimization heuristic:

New goal preference: Prefer grind-concave over is-mirror

End of trace.


